Evaluating re-identification risks with respect to the HIPAA privacy rule
Abstract
OBJECTIVE Many healthcare organizations follow data protection policies that specify which patient identifiers must be suppressed to share "de-identified" records. Such policies, however, are often applied without knowledge of the risk of "re-identification". The goals of this work are: (1) to estimate re-identification risk for data sharing policies of the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule; and (2) to evaluate the risk of a specific re-identification attack using voter registration lists.
MEASUREMENTS We define several risk metrics: (1) expected number of re-identifications; (2) estimated proportion of a population in a group of size g or less; and (3) monetary cost per re-identification. For each US state, we estimate the risk posed to hypothetical datasets, protected by the HIPAA Safe Harbor and Limited Dataset policies, by an attacker with full knowledge of patient identifiers and by an attacker with limited knowledge in the form of voter registries.
RESULTS The percentage of a state's population estimated to be vulnerable to unique re-identification (i.e., g=1) ranges from 0.01% to 0.25% when protected via Safe Harbor and from 10% to 60% when protected via Limited Datasets. In the voter attack, this number drops for many states, and for some states it is 0%, due to the variable availability of voter registries in the real world. We also find that re-identification cost ranges from $0 to $17,000, further confirming risk variability.
CONCLUSIONS This work illustrates that blanket protection policies, such as Safe Harbor, leave different organizations vulnerable to re-identification at different rates. It provides justification for locally performed re-identification risk estimates prior to sharing data.
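The group-size metric named in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the records, the choice of quasi-identifier fields (birth year, sex, 3-digit ZIP), and the function names are all hypothetical. The expected-count function additionally assumes an attacker who guesses uniformly at random among the k records sharing the same values, an assumption the abstract does not state.

```python
from collections import Counter


def group_sizes(records):
    """Count how many records share each quasi-identifier combination."""
    return Counter(records)


def proportion_in_groups_of_size_at_most(records, g):
    """Fraction of records that fall in a group of size g or less.

    With g=1 this is the fraction of records that are unique on the
    chosen quasi-identifiers.
    """
    if not records:
        return 0.0
    sizes = group_sizes(records)
    vulnerable = sum(n for n in sizes.values() if n <= g)
    return vulnerable / len(records)


def expected_reidentifications(records):
    """Expected number of correct re-identifications.

    Assumption (not from the abstract): each record in a group of size k
    is re-identified with probability 1/k, so every group contributes
    k * (1/k) = 1 and the total equals the number of distinct groups.
    """
    return len(group_sizes(records))


# Hypothetical toy data: (birth year, sex, 3-digit ZIP) per record.
toy_records = [
    ("1950", "F", "372"),
    ("1950", "F", "372"),
    ("1962", "M", "372"),
    ("1971", "F", "380"),
    ("1971", "F", "380"),
    ("1971", "F", "380"),
]

unique_fraction = proportion_in_groups_of_size_at_most(toy_records, 1)
pair_fraction = proportion_in_groups_of_size_at_most(toy_records, 2)
expected = expected_reidentifications(toy_records)
```

In this toy set only one of the six records is unique, so the g=1 fraction is 1/6, while raising g to 2 also counts the two records that share a group of size two. Coarsening a field merges groups and lowers these fractions; a locally performed estimate would run the same kind of count on the organization's own population.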
Related articles
Never too old for anonymity: a statistical standard for demographic data sharing via the HIPAA Privacy Rule
OBJECTIVE Healthcare organizations must de-identify patient records before sharing data. Many organizations rely on the Safe Harbor Standard of the HIPAA Privacy Rule, which enumerates 18 identifiers that must be suppressed (eg, ages over 89). An alternative model in the Privacy Rule, known as the Statistical Standard, can facilitate the sharing of more detailed data, but is rarely applied beca...
The Costs of HIPAA: To Patients, to Progress, and to the Nation’s Health
Recent studies including a 2009 Institute of Medicine report have highlighted how the HIPAA Privacy Rule fails to protect privacy and has created significant barriers to research. The purpose of this article is to outline the impact of the HIPAA Privacy Rule on patients and its cost to the research enterprise in terms of time, dollars, and lost opportunities. During this review, we found that H...
De-identification Methods for Open Health Data: The Case of the Heritage Health Prize Claims Dataset
BACKGROUND There are many benefits to open datasets. However, privacy concerns have hampered the widespread creation of open health data. There is a dearth of documented methods and case studies for the creation of public-use health data. We describe a new methodology for creating a longitudinal public health dataset in the context of the Heritage Health Prize (HHP). The HHP is a global data mi...
Potential impact of the HIPAA privacy rule on data collection in a registry of patients with acute coronary syndrome.
BACKGROUND Implementation of the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule has the potential to affect data collection in outcomes research. METHODS To examine the extent to which data collection may be affected by the HIPAA Privacy Rule, we used a quasi-experimental pretest-posttest study design to assess participation rates with informed consent in 2 cohorts o...
Journal title:
- Journal of the American Medical Informatics Association : JAMIA
Volume 17, Issue 2
Pages -
Publication date 2010